Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bump python sdk version #827

Open
wants to merge 42 commits into
base: main
Choose a base branch
from
Open

Bump python sdk version #827

wants to merge 42 commits into from

Conversation

eric-wang-1990
Copy link
Collaborator

@eric-wang-1990 eric-wang-1990 commented Oct 14, 2024

Upgrade python sdk version from 0.17.0 to 0.36.0
Create DatabricksCredentialManager to managed credentials, under the hood use Config from databricks.sdk.core to manage the auth part which comes from the Python SDK.
For dbt we support 4 different auth mode:

  • token : When user explicitly passed in a token=xxxx
  • external_browser: When user pass in auth_type=oauth with clientid=xxx and no client secret
  • oauth-m2m: When user pass in clientid=xxx and clientsecret=yyy
  • azure-client-secret: When user pass in azure_client_id=xxx and azure_client_secret=yyy and your host is an azure host

For oauth-m2m and azure-client-secret we set a preferred auth sequence based on if the clientSecret starts with "dose", so we can reduce the number of false trying as much as possible.

Remove the set/get sharedPassword part, since the python sdk already handles that, but with one defect which I already raise a PR against: databricks/databricks-sdk-py#823

With this change one thing that will no longer work is if customer are using non-databricks OAuth client for U2M case it will not work, since the python sdk does not support modify redirectUrl/scopes.

Description

Checklist

  • I have run this code in development and it appears to resolve the stated issue
  • This PR includes tests, or tests are not required/relevant for this PR
  • I have updated the CHANGELOG.md and added information about my change to the "dbt-databricks next" section.

benc-db and others added 26 commits June 24, 2024 13:24
… config file (#724)

Signed-off-by: Artur Peedimaa <[email protected]>
Co-authored-by: Artur Peedimaa <[email protected]>
Co-authored-by: Ben Cassell <[email protected]>
Signed-off-by: Dmitry Volodin <[email protected]>
Co-authored-by: Ben Cassell <[email protected]>
Signed-off-by: Roy Dobbe [email protected]
Co-authored-by: Roy Dobbe <[email protected]>
Signed-off-by: Kyle Valade <[email protected]>
Co-authored-by: Kyle Valade <[email protected]>
Co-authored-by: Ben Cassell <[email protected]>
self._lock.acquire()
try:
assert self._credentials_manager is not None, "Credentials manager is not set."
return self._credentials_manager
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is lock still required?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am just honoring the existing behavior. If there are multiple connections using the same DatabricksCredentials instance, there is a shared resource _config which can have concurrent write. @benc-db I think you introduce the original lock, is it still needed?

auth_type = (
"external-browser" if not self.client_secret
# if the client_secret starts with "dose" then it's likely using oauth-m2m
else "oauth-m2m" if self.client_secret.startswith("dose")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can it wrapped in a method like is_databricks_m2m? And this checks still a little bit tricky, do we really want to infer it from secret format? Can it be explicitly specified by the config to decide whether azure or databricks m2m should be used?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to what Jacky said. In design review we decided to not depend on secret format, but instead to a.) reference if a config flag is set to tell us to use AAD, and b.) if we fail with Databricks OAuth but succeed with AAD, emit a warning telling them to set the config flag.

@jackyhu-db
Copy link
Collaborator

Thanks @eric-wang-1990 for updating this. Can your rename the PR title to make it more readable?

@benc-db
Copy link
Collaborator

benc-db commented Nov 15, 2024

Hi Eric, since the last time we were looking at this PR, 1.9 has become the target in main. So please rebase and target main with PR.

Copy link
Collaborator

@benc-db benc-db left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove dependence on secret format, per internal design meeting.


def set_sharded_password(self, service_name: str, username: str, password: str) -> None:
max_size = MAX_NT_PASSWORD_SIZE
Without this, a local .netrc file in the user's home directory
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it safe to remove it seems databricks cli will not use .netrc anymore? https://docs.databricks.com/en/dev-tools/cli/profiles.html @benc-db @jackyhu-db

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think .netrc is used by python requests which might be used by databricks-sdk code as well?

self._lock.acquire()
try:
assert self._credentials_manager is not None, "Credentials manager is not set."
return self._credentials_manager
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am just honoring the existing behavior. If there are multiple connections using the same DatabricksCredentials instance, there is a shared resource _config which can have concurrent write. @benc-db I think you introduce the original lock, is it still needed?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants